<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>Notes for vil</title>
<style>code {color: #FF0000;}
</style>
</head>

<body>
<h1>Notes for vil/vil2</h1>

<h2>Good points about existing vil1</h2>
<ul>
  <li>The functional notation
  <ul>
    <li><code>vil1_image im = vil1_load("filename")</code></li>
  </ul>
  </li>
  <li>Support for processing very large images on disk.</li>
  <li>Support for lots of image types</li>
</ul>
<h2>Problems with existing vil1</h2>
<ul>
  <li>It doesn't support planes very well.</li>
  <li>Pointer indexing rather than arithmetic indexing
  <ul>
    <li>Pointers are slower than arithmetic on most modern platforms</li>
    <li>It is hard to build a 2d image from a slice of 3d data</li>
  </ul>
  </li>
  <li>The <code>vil1_image</code> smart pointer stuff is confusing.
  <ul>
    <li>Is it a view, is it a smart ptr, is it a smart smart ptr?</li>
  </ul>
  </li>
</ul>
<h2>There are some fundamental divisions</h2>
<table border="0" cellpadding="0" cellspacing="0" style="border-collapse: collapse"
 bordercolor="#111111" width="100%" id="AutoNumber1">
  <tr>
    <td width="33%" align="right">data directly accessible</td>
    <td width="3%" align="center">v</td>
    <td width="34%">data downloadable</td>
  </tr>
  <tr>
    <td width="33%" align="right">data type known at compile time</td>
    <td width="3%" align="center">v</td>
    <td width="34%">data type known at run time</td>
  </tr>
  <tr>
    <td width="33%" align="right">data ordering known a priori</td>
    <td width="3%" align="center">v</td>
    <td width="34%">data ordering known after image loading</td>
  </tr>
</table>
<p>Do these not indicate that trying to have a single image type (or hierarchy) is incorrect?
Does the single <code>vil1_image</code> hierarchy actually give you anything?</p>
<p>vil merges all the assumptions on the left in <code>vil_image_view&lt;T&gt;</code>,
and those on the right in <code>vil_image_resource</code></p>
<h2>Background</h2>
<p>Before ISBE, Manchester adopted VXL we used an image type which provided</p>
<ol>
  <li>Good plane support</li>
  <li>Arithmetic indexing, exposed in the interface</li>
  <li>A world to image transform</li>
  <li>A polymorphic class hierarchy whose base class could represent images of
  any dimension.</li>
</ol>
<p>We heavily used all these features, none of which were available in vil1. To
get round this problem we wrote our own public VXL-compatible image library -
mul/mil. It did all the above stuff, using vil1 to do the file loading and
saving. We also designed the library in terms of views of images. We first
informally raised the idea of rewriting vil1 to support our code at the Zurich
VXL meeting in March 2001, with encouraging responses. After spending 18 months
noticing that we were duplicating more and more code that was available in vil1,
the short-notice arrangement of a meeting in Providence on the 27th Sep 2002,
and the encouraging comments from other VXL members gave us the spur to design
this proposal.</p>
<h2>Philosophy</h2>
<p>We want a single "normal" and "easy" image class that we can point new users at.
This should also be the default class for doing actual pixel manipulation.</p>
<p>We want to keep the type proliferation down. In particular, we do not want to
have to write or maintain <code>vil_image_process_func()</code> for more image
types than absolutely necessary. So we enforce the following restriction in
vil's API</p>
<ul>
  <li><code>vil_image_resource_sptr</code> is used when you don't know an image's pixel type,
and you can't immediately access the pixels.</li>
  <li><code>vil_image_view&lt;T&gt;</code> is used when you know the type,
you know the structure and you can immediately access the pixels</li>
</ul>
<p>By keeping this
division clean we can avoid type proliferation. There are limited exceptional cases. One is
when you absolutely must have a function that works on a view without knowing its pixel type - e.g.
<code>vil_load(&quot;filename&quot;)</code> which returns a <code>vil_image_view_base_sptr</code>.</p>
<p>This division leads naturally to a division of labour.</p>
<ul>
  <li>A <code>vil_image_process_func() </code>that actually does the pixel value
    calculations should
take a concrete <code>vil_image_view&lt;T&gt;</code>.</li>
  <li>If someone wants a super-duper <code>vil_image_process_func() </code>(size-no-problem,
    runtime pixel-polymorphic, etc) then write a function that takes
<code>vil_image_resource_sptr</code>. However sub-contract the actual pixel processing
    to <code>vil_image_process_func(vil_image_view&lt;T&gt;)</code>.</li>
</ul>
<p>A <code>vil_image_view&lt;T&gt;</code>  is a <i>view</i> in the sense that it
provides a <i>view</i> of some image data. You can make concrete copies of views
cheaply, without needing to copy the underlying image data. As a service the
view memory-manages the image data as well, so you don't need to worry about the
underlying data at all.</p>
<p><code>vil_image_resource</code> is the abstract base of polymorphic class
tree. The design is such that most work can be done on the base class API. You
can cast it to its concrete type if you need to set some obscure property. To get
a functional style, most operations on a
<code>vil_image_resource</code>, should be on a <code>vil_image_resource_sptr</code>.</p>
<p>We intend to uphold this API design rigorously. Whilst we have no intention
of preventing users from doing whatever they like with their own code, we
believe that the current design is locally optimal within a wide radius of
convergence. We think that any VXL level-2 libraries should use these API
guidelines as well.</p>
<h2>Design decisions:</h2>
<h4><code>vil_image_view&lt;T&gt;::set_size(..)</code> should be virtual and in the base class</h4>
<p>It is
  in <code>mil_image_2d</code>. Resizing can be slow so virtual function is no problem. It can
be useful to set the size of a image without knowing its type. Decision IMS and
TFC.</p>
<h4><code>vil_memory_chunk</code> should not be templated.</h4>
<p>You could have a memory chunk containing rgb, and choose to view it as
a <code>vil_image_view&lt;vil_rgb&lt;vxl_byte&gt;&gt;</code> with nplanes=1, or
<code>vil_image_view&lt;char&gt;</code> with nplanes =3; It
is hard to get the type of the smart pointers correct without linking the types
T of the <code>vil_memory_chunk&lt;T&gt;</code> and <code>vil_image_view&lt;T&gt;</code>.
We can put a image type in a member variable if it is necessary (e.g. for vsl
IO). Decision IMS and TFC.</p>
<h4>Use (i, j, p) default names for index parameters, and <code>ni()</code>,
<code>nj()</code>, <code>nplanes()</code>, for
sizes.</h4>
<p>Want to avoid use of (x,y) to avoid giving the impression that pixels are in a
Cartesian co-ord system. It is good to link index name and size name to avoid
for loop mistakes. Alternative was row, col, plane, and rows, cols, and planes.
Inner loops should have short variable names. Decision at Providence meeting.</p>
<h4>vil_image_view&lt;T&gt; should have a sparse interface.</h4>
<p>Methods like <code>set_to_window(..)</code>, <code>flip_ud()</code>,
<code>transpose()</code>,
can all be provided as standalone functions without much addition overhead.
Class API's should be small rather than large. Use Doxygen's <code>\relates</code> option
to help users find related functions. Decision IMS.</p>
<h4>Using arithmetic indexing scheme</h4>
<p><code>i.e.<br>
vil_image_view&lt;T&gt;::operator(i,j,p) {return *(top_left_ptr + i * istep + j* jstep + p*planestep);}</code><br>
Don't use pointer indexing.<br>
<code>vil_image_view&lt;T&gt;::operator(i,j) {return raster_ptrs[j][i];}<br>
</code>Multiplication is fast on modern processors. Pointers require extra
memory lookup which is slower, and can reduce cache hit rate. Pointers do not
generalise well to planes, or 3D. Decision confirmed at Providence meeting.</p>
<h4>Best order for index parameters is i,j,plane.</h4>
<p>There is no preferred indexing order with arithmetic indexing. You don't know
whether the planestep is 1 or istep is 1. However, for ease of use it is worth
having a consistent interface. Two alternatives are (plane, i, j) and (i, j, plane).
The former is probably more natural, however, the latter allows us to use
default argument plane=0, which is very useful for keeping interface clutter
down. Decision at Providence meeting.</p>
<h4>Use <code>vil_image_view_base_sptr</code> in preference to <code>vil_image_view_base *</code>.</h4>
<p>Main uses of this are in <code>vil_load(..)</code> and <code>vil_image_resource::get_view(..)</code>.
These functions cannot return a real vil_image_view&lt;T&gt; because that would
involve knowing T in advance. They cannot return references because the real
object will get destroyed as the function ends before it is assigned to a
variable in the calling level. We use the smart over the raw pointer to avoid a
likely source of memory leaks. However, most code in vil should operate on
<code>vil_image_view&lt;T&gt;</code>.
Decision at Providence meeting.</p>
<h4><code>vil_image_view&lt;T&gt;</code> should not be derived from <code>vil_image_resource</code></h4>
<p>As explained above these are actually two different types. Efficient access
to ni, nj for <code>vil_image_view&lt;T&gt;</code> whilst matching interface for
<code>vil_image_resource</code> would require a lot of complexity.
Decision confirmed at Providence meeting.</p>
<h4>There should  be no general <code>vil_image_resource::set_properties(..)</code>?</h4>
<p>Its existence would imply the ability to set arbitrary properties. You can always have a
specialist <code>set_obscure_tiff_property(..)</code> as a member function of
<code>vil_tiff_file_image</code>. Decision by IMS and AWF. There will be a
<code>vil_image_resource::get_properties(..)</code> with all properties documented
in <code>vil/vil_properties.h</code>. This allows for partially shared
properties such as bits per component or physical pixel height
</p>
<h4><code>vil_image_resource::get_view(..)</code>, etc. will not allow you to specify planes.</h4>
<p>It is easy to split off individual planes later. Although plane
specifications could give
you a memory advantage -  a factor of 3 is not big enough to be useful
for v. large images. Not having planes specifications can make programming
<code>vil_file_image_resource::get_view(..)</code>,
etc. a lot easier and potentially faster.
Decision at Providence meeting.</p>
<h4>Index and size types should be <code>unsigned int</code>.</h4>
<p>Signed has the advantage that it is much easier to pick up -ne overflow
errors. Unsigned has the advantage that it is more natural, makes the fixed 0,0
origin clear, and it reduces the number of assertion checks needed (admittedly
by exposing you to the -ne overflow errors:) We can catch some -ne overflow
errors by <code>assert&nbsp;(i != (unsigned)(-1))</code>. Decision at Providence
meeting.</p>
<h4>Use function overloading and normal filename prefix for functions on <code>
vil_image_view&lt;T&gt;</code>, and functions on
<code>vil_image_resource</code> objects.</h4>
<p>Where function overloading doesn't discriminate (e.g.
<code>vil_load(&quot;filename&quot;)</code> ) then use
<code>vil_load_image_resource(&quot;filename&quot;)</code>
for <code>vil_image_resource</code> objects. Decision IMS and TFC</p>
<h4><code>vil_image_view&lt;T&gt;::set_size(ni, nj)</code> will make no changes if size is same.
It will detach from original underlying data and create new data if different.</h4>
<p>This has some odd effects. If you have multiple views to the same data then <code>
set_size</code> may or may not move view to different data. However, it is very useful
when treating views as ordinary images, and allows very efficient use of
workspace images. For example multiple calls to an image processing class often
need identically sized image workspaces. Using this <code> set_size</code> means that it doesn't
have to reallocate memory on a regular basis. <code>mil_image</code> has been
using this design for 18 months with no problems. Decision at Providence meeting.</p>
<h4><code>vil_image_view&lt;T&gt;::operator=()</code> can be smart but should do fast view transforms only.</h4>
<p>Allowing <br>
<code>vil_image_view&lt;vil_rgb&lt;vxl_byte&gt; &gt; im_rgb(ni, nj);<br>
vil_image_view&lt;vxl_byte&gt; im_planes = im_rgb;</code><br>
and<br>
<code>vil_image_view&lt;vil_rgb&lt;vxl_byte&gt;&gt; pnm = vil_load(&quot;filename.ppm&quot;);</code><br>
makes for easy use. However , expensive operations should not have short
names. So this should not include doing conversions such as byte to float
conversion. These conversions should be done by separately named functions.
Decision at Providence meeting.</p>
<h4><code>vil_image_view&lt;T&gt;::operator=()</code> should take base class and base class
sptr.</h4>
<p>Equals is defined as <code>vil_image_view&lt;T&gt;::operator=(const
vil_image_view_base &amp;</code>. This is an operator= from its own base class.
This doesn't appear to cause any problems, and allows the above fast view
transformations between pixel types. Additionally
<code>vil_image_view&lt;T&gt;::operator=(const vil_image_view_base_sptr &amp;)</code>
allows for the simple conversion from the return value of many functions
such as vil_load. These need to return base class sptrs to reduce memory leaks
and because their interface cannot know the pixel type. Decision IMS</p>
<h4>Should <code>vil_load</code>  should return a base class sptr.</h4>
<p>(Similarly <code>vil_image_resource::get_view(..)</code>)
They could return <code>vil_image_view_base *</code> or <code>vil_image_view_sptr</code>
Both appear to work. The smart pointer version has the advantage of not needing
users to be careful about deletion. Decision by consensus on the reflector.</p>
<h4>All pixel types should be explicit about type length (e.g. <code>vxl_uint_16</code>)</h4>
<p>Alternative is to have <code>vil_image_resource::get_view()</code> automatically
pick the best standard type for that platform (e.g. short) The problem with this
becomes clear on 64 bit platforms. What types should 16bit or 32bit data get
loaded into (The same platform can't define short as both.). Decision at
Providence meeting. Question - do the float types need to be specific or can
we assume IEEE standard lengths?&nbsp;</p>
<h4>All operations will default to scalar pixels and multiple planes.
<code>vil_image_resource</code> hierarchy will mostly work only with scalar pixels.</h4>
<p>Nothing prevents you from using <code>vil_rgb</code> pixels, and we will
provide appropriate support. However planes are more general than components,
and <code>vil_image_resource</code> hierarchy can be simpler if it only thinks
about scalar pixels. In particular it may not be feasible to consider every
possible compound pixel type in a <code>switch()</code> statement, but it is
feasible to consider every possible scalar pixel type.
Decision confirmed at Providence meeting.</p>
<h4>Base class image type polymorphic in dimension and pixel type, and containing
world to image transforms will be provided in a separate level2 library - vimt.</h4>
<p>vimt benefits from being able to use both vil and vgl (and possibly vnl).
Decision TFC.</p>
<h4>Deal with const pointer data issues pragmatically inside vil.</h4>
<p>Pointers to data are often passed in as const, but may have the const removed.
This is because it is rather complicated to keep track of whether the data is going to be
changed or not.  Just try to be sensible, OK? Decision TFC.</p>
<h4>All standard image processing algorithms will go into vil/algo</h4>
<p>We will dump the vipl interface. All image processing code would be written
in terms of <code>vil_image_view&lt;T&gt;</code> first. Then write another function
which uses the first function, but does all the stitching stuff. Decision at
Providence meeting.</p>
<h4>All vil/algo image processing functions will have their input images as the
first parameters.</h4>
<p>e.g. <code>vil_sobel&lt;T&gt;(const vil_image_view&lt;T&gt;&amp;source, vil_image_view&lt;T&gt;
&amp;dest_i, vil_image_view&lt;T&gt; &amp;dest_j);</code></p>
<p>This is different from the operator=(..) style, but is more common in other
function libraries. Decision at Providence meeting.</p>
<h4>Best names are <code>vil_image_view, vil_image_resource</code></h4>
<p>A beginner will expect to
see the word image in the main image type. However it aids understanding to be
reminded that this is a view so <code>vil_image_view</code>. <code>vil_image_resource</code>
was originally <code>vil_image_data</code>. However that was thought to be
confusing and might get confused with the <code>vil_memory_chunk</code> class.
So <code>vil_image_resource</code>. Decision at Providence meeting and by vote
on vxl-maintainers.</p>
<h4>Use <code>set_size(..)</code> rather than <code>resize(..)</code>.</h4>
<p>The STL has introduced the notion of a data preserving resize. Since our code
doesn't do that, we use <code>set_size(..)</code> instead. Decision
Amitha and consensus on the reflector.</p>
<h4>raster and plane step types should be <code>vcl_ptrdiff_t</code>.</h4>
<p>This is strictly the correct type to add to a pointer to get another pointer.
It isn't signed int on platforms with 64-bit address bus but 32-bit data. Decision
IMS.</p>
<h4><code>vcl_complex&lt;float&gt;</code> is a scalar pixel type.</h4>
<p>Whilst useful under certain circumstances, it is not sensible default
behaviour to allow vil to treat a complex&lt;float&gt; as a two plane float
image. Therefore <code>vil_image_resource</code> derivatives should treat
complex&lt;float&gt; as ones of its default pixel types. Decision - FW and IMS.</p>
<h4><code>vil_image_view&lt;T&gt;</code> should have a constructor which allows
interleaved or separated plane storage.</h4>
<p>This is the single most useful specialist memory format for a view, and so
deserves to be easily created. Decision PVr and IMS.</p>
<h4>There should be versions of <code>vil_convert_*()</code> which take and
return <code>vil_image_view_base_ptr</code>s.</h4>
<p>There are already versions which take <code>vil_image_view&lt;T&gt;</code>s.
However when loading an image it is often useful to force it into a particular
pixel-type, arrangement, etc. We cannot assign it to a concrete <code>vil_image_view&lt;T&gt;</code>
because that will fail if the types are incompatible. If we want to allow <code>vil_load()</code>
to remain easy to use then we cannot solve this problem by providing <code>vil_convert(vil_image_resource_sptr)</code>.
This is the 3rd place where we allow the use of <code>vil_image_view_base_ptr</code>s
(after <code>vil_load()</code> and <code>vil_image_resource::get_view()</code>.)
Decision IMS and TFC.</p>
<p>n.b. consistency suggests providing <code>vil_convert(vil_image_resource_sptr)</code>
versions.</p>
<h4>file resources should return pixel values in least significant bit position.</h4>
<p>File types that have odd pixel widths (e.g. DICOM can use 12-bit images)
should not scale those pixels to fill the full range of the C++ pixel-type that
they return. So our 12-bit DICOM image may return vxl_uint_16 pixels with values
from 0 to 4095. Decision IMS, TFC and Marc Laymon (GE).</p>
<p>&nbsp;</p>
<h2>Remaining questions</h2>
<p>There is now no way to get a bit-compressed pbm file into memory efficiently. Is this a problem?
We could solve it by deriving a <code>vil_image_view_of_compressed_bits</code>, or similar.</p>
<h2>Possible improvements</h2>
<h4>Add a <code>get_pixel(void&nbsp;*buf,i,j)</code> and
<code>put_pixel(void&nbsp;*buf,i,j)</code> to <code>vil_image_resource.</code></h4>
<p>These might be useful for efficient implementation of some algorithms. They
could be implemented in terms of <code>get_copy_view(..)</code> and
<code>put_view(..)</code> in the base class. However, in some cases you need to
probe/set a pixel at a time and the overhead of allocating a <code>vil_image_view&lt;T&gt;</code>
for each <code>get_view(..)</code> is quite expensive. Note that the cost of
allocating the <code>vil_memory_chunk</code> is irrelevant. If the image is
already in memory, no new <code>vil_memory_chunk</code> is allocated. If the
image isn't already in memory, the cost of the new memory chunk is smaller than
the disk access time. Note, one alternative may be to bring back the <code>
vil_image_resource::fill_view(...)</code> which modifies the contents of an existing
view and doesn't allocate any memory. IMS.</p>
<h4>Add block processing implementation to the <code>vil_image_resource</code>
derivatives for very large images.</h4>
<p>The basic idea of dealing with large images was designed into vil1. It
deferred loading the whole image into memory until actually needed. You could
then load only one block of the image as required. This basic idea has been
passed into vil2/vil. However, as with vil1, this is only a design outline and
API. A lot of implementation to do block-by-block processing efficiently is
missing.</p>
<p> The first step is to add <code>preferred_block_size</code>
and <code>preferred_block_origin</code> to <code>vil_property.h</code>. Both of
these <code>unsigned[2]</code> properties would be provided by file resources
and used by the processing resources to do the work one block at a time. I guess
many of the existing file resources will specify blocks that are one or more
full rasters. Those file formats designed to deal with large images (including
TIFF) can store their pixels in rectangular tiles. The processing resources should be able to use several of their
supplier's blocks together up to some locally decided (or globally specified)
limit. The processing resources would need to export these properties including
any modification (e.g. <code>vil_transpose()</code>.)&nbsp;</p>
<p>Quite how <code>vil_convolve_1d_resource</code><code> </code>should deal with
a single column for a preferred block size (when <code>vil_convolve_1d_resource</code>
operates on rows) - I don't know. It would take some experimentation to get an
efficient solution. Until then we can always rely on the fact that the <code>preferred_block_size</code>,
etc are just hints and can safely be ignored at the loss of efficiency. IMS.</p>
<p>&nbsp;</p>
</body>
</html>
